Statistical modeling for COVID 19 infected patient’s data in Kingdom of Saudi Arabia

The objective of this study is to construct a new distribution, the weighted Burr–Hatke distribution (WBHD). The PDF and CDF of the WBHD are derived in closed form, together with its moments, incomplete moments, and quantile function. Eleven estimation techniques for the distribution parameters are discussed, and numerical simulations are used to compare them through partial and overall rankings. According to the overall rank table, the maximum product of spacing estimator (MPSE) is the recommended estimator for the WBHD parameters. Actuarial measures are derived for the proposed distribution. By comparing the WBHD with competing distributions on two real data sets of COVID-19 mortality rates, we show the importance and flexibility of the WBHD.


Introduction
One of the fundamental goals of statistics is to build effective statistical models for natural and real-life phenomena that can be represented by established probability distributions. Such distributions are used to model the uncertain and potentially hazardous life events that are of interest to the researcher. Because standard distributions often struggle with the complexity of real-life phenomena, a large number of other probability distributions have been devised.
Sometimes the known and available probability distributions are still unable to adequately describe the data for particular natural events. Expanding and modifying these distributions leads to the generalised probability distributions. For further reading, see [1].
Adding one or more parameters to well-known probability distributions improves their applicability to data on natural events and increases the accuracy with which they represent the tail shape. There are various useful methods for increasing the flexibility of the traditional statistical distributions. One of them is to include an extra parameter in the distribution, for example through the power (P) transformation. In this article we obtain the two-parameter WBHD, which has a variety of interesting properties, by building on the distributions discussed above. Its PDF is flexible: it can be right-skewed (positively skewed), left-skewed (negatively skewed), or symmetric, which provides extra tail flexibility. Its hazard rate can be decreasing, increasing, bathtub-shaped, or reverse-J shaped, among other forms. In addition, the proposed distribution has an exact closed-form CDF and PDF that are relatively easy to handle. Because of these benefits, the distribution has promising potential for applications in a wide range of fields, such as life testing in biotechnology, durability studies, and econometrics. For further reading, see [2][3][4][5][6][7].
Recent years have seen growing interest in developing novel lifetime distributions for fitting real lifetime data; see, for example, [8][9][10][11][12][13][14][15][16]. Order statistics are also widely used to study the properties and applications of random variables and of functions related to them; see [17][18][19].
The most important question is whether this distribution is needed.
To answer it, we briefly summarise the relevance of the WBHD: (i) the statistical functions of the WBHD can be expressed in a simple closed form; (ii) the properties of the WBHD can be derived without the need for special mathematical functions; (iii) the proposed WBHD provides more flexibility than existing distributions in terms of the shape of the hazard rate function; and (iv) the proposed model is capable of fitting different kinds of data, such as medical, engineering, and actuarial data, which makes it useful in many fields of science.
The rest of this article is organised as follows. Section 2 presents the proposed WBHD with its PDF and CDF, together with plots of the PDF and the hazard rate function (HRF). Section 3 establishes several statistical properties of the WBHD. Section 4 discusses eleven classical estimation methods and reports the simulation study and its numerical findings. Section 5 discusses risk measures of the proposed model. Section 6 presents the real data analysis. Section 7 gives concluding remarks.

Formulation of the WBHD
In this section we formulate the proposed model. Using the cumulative distribution function (CDF) of the WD-G [20], we define the CDF of the two-parameter WBHD in (1) and its probability density function (PDF) in (2). We now plot the possible shapes of the PDF and HRF of the WBHD. Fig 1 shows three possible shapes of the PDF: increasing, decreasing, and unimodal. Fig 2 shows the possible shapes of the HRF.
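The following Python sketch may help when experimenting with the shapes discussed above. It is not taken from the paper: it assumes that the CDF (1) has the form F(x) = log(1 + G(x)^α) / log(2), with Burr-Hatke baseline G(x) = 1 − e^{−ax}/(x + 1), which is the form suggested by the coefficients D_j and the functions G_j(x) in the Linear representation section. The function names, and the reading of a as the baseline parameter and α as the power parameter, are assumptions.

```python
import math

def bh_cdf(x, a):
    """Burr-Hatke baseline CDF, G(x) = 1 - exp(-a*x) / (x + 1), for x >= 0."""
    return 1.0 - math.exp(-a * x) / (x + 1.0)

def bh_pdf(x, a):
    """Derivative of bh_cdf with respect to x."""
    return math.exp(-a * x) * (a * (x + 1.0) + 1.0) / (x + 1.0) ** 2

def wbhd_cdf(x, a, alpha):
    """ASSUMED WBHD CDF: log(1 + G(x)**alpha) / log(2)."""
    return math.log1p(bh_cdf(x, a) ** alpha) / math.log(2.0)

def wbhd_pdf(x, a, alpha):
    """Derivative of the assumed CDF; evaluate at x > 0."""
    g = bh_cdf(x, a)
    return alpha * g ** (alpha - 1.0) * bh_pdf(x, a) / (math.log(2.0) * (1.0 + g ** alpha))

def wbhd_hrf(x, a, alpha):
    """Hazard rate function, pdf / survival function."""
    return wbhd_pdf(x, a, alpha) / (1.0 - wbhd_cdf(x, a, alpha))
```

Evaluating `wbhd_pdf` and `wbhd_hrf` on a grid of x values for several (a, α) pairs is one way to explore shapes like those in Figs 1 and 2, provided the assumed form matches (1).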

Statistical properties
Establishing the mathematical properties of the proposed model is essential for studying its behaviour and for simplifying computation; for instance, generating data from the proposed distribution depends on the quantile function. This section discusses the statistical properties of the WBHD.

Quantile function
The quantile function (QF) of the WBHD is obtained by inverting the CDF (1). The resulting expression is defined for 0 < p < 1 and involves W(·), the Lambert function. The QF is used to find the WBHD quartiles and to generate random data sets from the distribution.
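As a hedged illustration of inverse-transform sampling with a Lambert-function quantile, the sketch below assumes the CDF F(x) = log(1 + G(x)^α) / log(2) with G(x) = 1 − e^{−ax}/(x + 1); this form is inferred from the Linear representation section and is not quoted from (1). Under that assumption, solving F(x) = p gives x = W(a e^{a} / (1 − u)) / a − 1 with u = (2^p − 1)^{1/α}. The helper names and the Newton solver for W are illustrative choices.

```python
import math
import random

def lambert_w0(z, tol=1e-12, max_iter=100):
    """Principal branch of the Lambert W function for z >= 0, by Newton iteration."""
    if z < 0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # lies at or above the root, so Newton descends monotonically
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def wbhd_cdf(x, a, alpha):
    """ASSUMED WBHD CDF: log(1 + G(x)**alpha) / log(2)."""
    g = 1.0 - math.exp(-a * x) / (x + 1.0)
    return math.log1p(g ** alpha) / math.log(2.0)

def wbhd_quantile(p, a, alpha):
    """Quantile of the assumed CDF for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    u = (2.0 ** p - 1.0) ** (1.0 / alpha)  # value of the baseline CDF at the quantile
    z = a * math.exp(a) / (1.0 - u)
    return lambert_w0(z) / a - 1.0

def wbhd_sample(n, a, alpha, seed=None):
    """Inverse-transform sampling: apply the quantile function to uniform draws."""
    rng = random.Random(seed)
    eps = 1e-12  # pragmatic guard that keeps p away from 0 and 1
    return [wbhd_quantile(min(max(rng.random(), eps), 1.0 - eps), a, alpha)
            for _ in range(n)]

# Quartiles for one illustrative parameter choice
quartiles = [wbhd_quantile(p, 1.5, 2.0) for p in (0.25, 0.5, 0.75)]
```

A quick sanity check is the round trip `wbhd_cdf(wbhd_quantile(p, a, alpha), a, alpha)`, which should return p up to numerical tolerance.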

Linear representation
The CDF (1) and the PDF (2) of the proposed model can be represented linearly by using the expansion $\log(1+x) = \sum_{j=1}^{\infty} (-1)^{j+1} \frac{x^{j}}{j}$. In the resulting representation, the coefficients are $D_j = \frac{(-1)^{j+1}}{j\log(2)}$ and $G_j(x) = \left(1 - \frac{e^{-ax}}{x+1}\right)^{\alpha j}$ is the CDF of the exponentiated Burr-Hatke distribution (ExBHD).
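The step from the logarithmic expansion to the linear representation can be sketched as follows. This is a reconstruction under the assumption that the CDF (1) has the form shown in the first line, which is the form consistent with the coefficients D_j and the functions G_j(x); here g(x) denotes the derivative of the baseline CDF G(x), and the series is valid because 0 ≤ G(x)^α ≤ 1.

```latex
\begin{align*}
F(x) &= \frac{\log\{1 + G(x)^{\alpha}\}}{\log(2)}, \qquad G(x) = 1 - \frac{e^{-ax}}{x+1}, \\
F(x) &= \frac{1}{\log(2)} \sum_{j=1}^{\infty} \frac{(-1)^{j+1}}{j}\, G(x)^{\alpha j}
      = \sum_{j=1}^{\infty} D_j\, G_j(x), \\
f(x) &= \sum_{j=1}^{\infty} D_j\, g_j(x), \qquad
g_j(x) = \frac{d}{dx} G_j(x) = \alpha j\, g(x)\, G(x)^{\alpha j - 1}.
\end{align*}
```

Under this reading, any property that is linear in the density, such as a moment, can be written as a weighted sum of the corresponding ExBHD quantities.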

Moments
The qth moment of the WBHD is expressed through the coefficients $h(\alpha j, i, k) = (-1)^{i+k} \frac{\Gamma(\alpha j)}{j!\,\Gamma(\alpha j - i)} \binom{i+k+1}{k}$. Setting q = 1, 2, 3, and 4, respectively, we obtain the first four moments about the origin of the WBHD. The nth central moment of X, say $\mu_n$, and the cumulants ($\kappa_n$) of X then follow from the moments about the origin.

Incomplete moments
The dth incomplete moment of the WBHD is denoted by $m_d$. Incomplete moments are useful in many fields; for example, the Lorenz curve $L(p)$ is obtained from the first incomplete moment $m_1$ evaluated at $x_p$, where $x_p$ is the quantile function. The first incomplete moment $m_1$ is also used to calculate both the mean residual life and the mean waiting time.

Order statistics
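The PDF and CDF of the ith order statistic for the WBHD are obtained by substituting the CDF (1) and the PDF (2) into the general expressions for order statistics.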
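For reference, the standard textbook expressions for the ith order statistic in a random sample of size n from a continuous distribution with PDF f and CDF F are given below; the notation f_{i:n} and F_{i:n} is an assumption and may differ from the notation used by the authors.

```latex
\begin{align*}
f_{i:n}(x) &= \frac{n!}{(i-1)!\,(n-i)!}\, f(x)\, F(x)^{i-1}\, \{1 - F(x)\}^{n-i}, \\
F_{i:n}(x) &= \sum_{k=i}^{n} \binom{n}{k}\, F(x)^{k}\, \{1 - F(x)\}^{n-k}.
\end{align*}
```

Setting i = 1 and i = n gives the distributions of the sample minimum and maximum, respectively.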

Methods of estimation
This section discusses eleven techniques for estimating the WBHD parameters, $\theta = (a, \alpha)^{\top}$, and compares them using Monte Carlo simulations. To compute the estimates of θ in the following approaches, we use the AdequacyModel package for R, which offers a general meta-heuristic optimization method for maximizing or minimizing an arbitrary objective function. Visit https://rdrr.io/cran/AdequacyModel/ for more information.

Classical methods of estimation
i. The maximum likelihood estimates (MLE) of the WBHD parameters are obtained by maximizing the log-likelihood function of a random sample $x_1, \ldots, x_n$ from the WBHD.
ii. The Anderson-Darling estimates (ADE) of the WBHD parameters are obtained by minimizing the Anderson-Darling objective function.
iii. The right-tail Anderson-Darling estimates (RADE) of the WBHD parameters are obtained by minimizing the right-tail Anderson-Darling objective function.
iv. The left-tailed Anderson-Darling estimates (LTADE) of the WBHD parameters are obtained by minimizing the left-tailed Anderson-Darling objective function.
v. The Cramér-von Mises estimates (CVME) of the WBHD parameters are obtained by minimizing the Cramér-von Mises objective function.

PLOS ONE
vi. The least-squares estimates (LSE) of the WBHD parameters are obtained by minimizing the least-squares objective function, computed from the ordered sample $x_{(1)}, \ldots, x_{(n)}$.
vii. The weighted least-squares estimates (WLSE) of the WBHD parameters are obtained by minimizing the weighted least-squares objective function.
viii. The maximum product of spacing estimates (MPSE) of the WBHD parameters are obtained by maximizing the product-of-spacings objective function.
ix. The minimum spacing absolute distance estimates (MSADE) of the WBHD parameters are obtained by minimizing the absolute distance objective function of the spacings.
x. The minimum spacing absolute-log distance estimates (MSALDE) of the WBHD parameters are obtained by minimizing the absolute-log distance objective function of the spacings.
xi. The percentile estimates (PE) of the WBHD parameters are obtained by minimizing the percentile objective function.
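To make the idea of estimating parameters by optimizing an objective function concrete, the sketch below implements the textbook forms of five of the objectives listed above for a generic model given by its `cdf` and `pdf` callables. These are the standard definitions, not formulas copied from the paper. The exponential distribution is used purely as a stand-in model, and the grid search replaces the meta-heuristic optimizer only to keep the example self-contained; all function names are illustrative.

```python
import math

def neg_log_likelihood(pdf, data):
    """Negative log-likelihood; minimizing it gives the MLE."""
    return -sum(math.log(pdf(x)) for x in data)

def ad_objective(cdf, data):
    """Anderson-Darling objective (to be minimized)."""
    xs = sorted(data)
    n = len(xs)
    total = sum((2 * i - 1) * (math.log(cdf(xs[i - 1])) + math.log(1.0 - cdf(xs[n - i])))
                for i in range(1, n + 1))
    return -n - total / n

def cvm_objective(cdf, data):
    """Cramer-von Mises objective (to be minimized)."""
    xs = sorted(data)
    n = len(xs)
    return 1.0 / (12 * n) + sum((cdf(x) - (2 * i - 1) / (2 * n)) ** 2
                                for i, x in enumerate(xs, start=1))

def ls_objective(cdf, data):
    """Least-squares objective (to be minimized)."""
    xs = sorted(data)
    n = len(xs)
    return sum((cdf(x) - i / (n + 1)) ** 2 for i, x in enumerate(xs, start=1))

def mps_objective(cdf, data):
    """Mean log-spacing (to be MAXIMIZED), with F(x_(0)) = 0 and F(x_(n+1)) = 1."""
    xs = sorted(data)
    u = [0.0] + [cdf(x) for x in xs] + [1.0]
    return sum(math.log(u[i] - u[i - 1]) for i in range(1, len(u))) / (len(xs) + 1)

def grid_minimize(objective, grid):
    """Toy optimizer: return the grid point with the smallest objective value."""
    return min(grid, key=objective)

def exp_cdf(rate):
    """Stand-in model for the demonstration: exponential CDF with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x)

# Synthetic "sample": exact exponential(rate=2) quantiles at i/(n+1)
n = 9
data = [-math.log(1.0 - i / (n + 1)) / 2.0 for i in range(1, n + 1)]
grid = [k / 10 for k in range(5, 41)]
rate_ls = grid_minimize(lambda r: ls_objective(exp_cdf(r), data), grid)
rate_mps = grid_minimize(lambda r: -mps_objective(exp_cdf(r), data), grid)
```

For the WBHD, the same objectives would be evaluated with the two-parameter CDF and PDF, and the one-dimensional grid would be replaced by a two-dimensional optimizer.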

Monte Carlo simulations
We use simulations to investigate how closely each estimation method recovers the true values of the WBHD parameters. We use the sample sizes n = 25, 60, 100, 200, 300, and 500, together with several parameter values. Using the software package R, we generate N = 1,000 random samples from the WBHD and compute the average absolute biases (ABBs), mean square errors (MSEs), and mean relative errors (MREs), where $\mathrm{MRE} = \frac{1}{N}\sum_{i=1}^{N} |\hat{\theta}_i - \theta_i| / \theta_i$. We find that the MPSE and MLE are the best methods for estimating the parameters from randomly generated WBHD data sets, followed by the ADE. Tables 1-5 give the numerical results of our simulation, while Table 6 reports the ranks of each estimation method.
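The three accuracy measures can be computed from the N replicated estimates of a single parameter as sketched below. Only the MRE formula is stated in the text; the definitions used here for the average absolute bias and the MSE are the usual ones and are assumptions, as are the function and key names.

```python
def simulation_metrics(estimates, true_value):
    """Summarize replicated estimates of one parameter.

    estimates  : list of N estimates, one per simulated sample
    true_value : the parameter value used to generate the samples (non-zero)
    """
    n_rep = len(estimates)
    abs_errors = [abs(est - true_value) for est in estimates]
    return {
        # average absolute bias: mean of |estimate - true value| (assumed definition)
        "abs_bias": sum(abs_errors) / n_rep,
        # mean square error: mean of (estimate - true value)^2 (assumed definition)
        "mse": sum(err ** 2 for err in abs_errors) / n_rep,
        # mean relative error: mean of |estimate - true value| / true value
        "mre": sum(err / true_value for err in abs_errors) / n_rep,
    }

# Hypothetical estimates from three replications with true value 2.0
metrics = simulation_metrics([1.8, 2.2, 2.0], 2.0)
```

In a full study this function would be applied to each parameter, each estimation method, and each sample size, and the methods would then be ranked on the resulting values.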

Concluding remarks on simulation results
1. As the sample size increases, the MSE decreases gradually.
2. As the sample size increases, the MRE decreases gradually.
3. As the sample size increases, the bias decreases gradually.
4. Table 6 shows that the MPSE is the best estimation method, as it has the lowest overall rank.
5. Table 6 also shows that the MLE is the second-best estimation method, as it has the second-lowest overall rank.

Risk measures
In this section we study some risk measures for the WBHD. The first is the value at risk (VR), which is a quantile of the cumulative loss distribution (see Artzner [21]); for the WBHD it is obtained from the quantile function at the level q. The second is the tail value at risk (TVR), which measures the expected loss given that an event outside the specified probability level occurs. For the WBHD it is defined as $\mathrm{TVR}_q = \frac{1}{1-q}\int_{\mathrm{VR}_q}^{\infty} x f(x)\,dx$.
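Both risk measures can be approximated numerically from any quantile function Q, since substituting p = F(x) turns the tail integral into an integral of Q(p) over (q, 1), divided by 1 − q. The sketch below uses a simple midpoint rule; the exponential distribution serves only as a stand-in whose closed-form answer makes the computation easy to check, and all function names are illustrative.

```python
import math

def value_at_risk(quantile, q):
    """Value at risk at level q: the q-th quantile of the loss distribution."""
    return quantile(q)

def tail_value_at_risk(quantile, q, n_steps=20000):
    """Tail value at risk: (1 / (1 - q)) * integral of Q(p) over (q, 1).

    Uses the midpoint rule, so the quantile is never evaluated at p = 1.
    """
    h = (1.0 - q) / n_steps
    total = sum(quantile(q + (k + 0.5) * h) for k in range(n_steps))
    return total * h / (1.0 - q)

def exponential_quantile(rate):
    """Stand-in model for the demonstration: exponential quantile function."""
    return lambda p: -math.log(1.0 - p) / rate

q_exp = exponential_quantile(2.0)
vr_95 = value_at_risk(q_exp, 0.95)
tvr_95 = tail_value_at_risk(q_exp, 0.95)
```

For the exponential stand-in the tail value at risk equals the value at risk plus 1/rate, which gives a convenient check; passing a WBHD quantile function instead would give the corresponding WBHD measures.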

Numerical simulations for risk measures
This subsection discusses numerical results for the risk measures of the WBHD and the BHD. Tables 7 and 8 present numerical values of the two risk measures for both the WBHD and the BHD; these results are also shown graphically in Figs 3 and 4. From these tables, we conclude that the proposed model yields larger values of both measures than the BHD. The WBHD therefore has a heavier tail than the BHD and can be used for modeling insurance data and other heavy-tailed real data sets.

Analysis of COVID-19 real data sets
This section uses real-world COVID-19 data sets to demonstrate the flexibility of the distribution. The first data set gives the COVID-19 mortality rate in Saudi Arabia over a period of forty days, from the 22nd of July to the 30th of August 2021. The second data set gives the COVID-19 mortality rate in Saudi Arabia over a period of 32 days, from the 15th of September 2020 to the 16th of October 2020. Both data sets are available at https://covid19.who.int/. To show how flexible the WBHD is, we compare it with a variety of well-known models: the Burr-Hatke distribution (BHD) [22], inverse power Burr-Hatke distribution (IPBHD) [23], logarithmic Burr-Hatke exponential distribution (LBHED) [24], alpha power exponential distribution (APED) [25], Frechet distribution (FD), exponential distribution (ED), Lindley distribution (LND), Lomax distribution (LD), Frechet Weibull distribution (FWD) [15], and Maxwell distribution (MD). We use several analytical criteria to identify the model best suited to the COVID-19 data sets: the Akaike information criterion ($A_1$), the corrected Akaike information criterion ($A_2$), the Bayesian information criterion ($A_3$), and the Hannan-Quinn information criterion ($A_4$). We also consider goodness-of-fit measures, namely the Anderson-Darling ($F_1$), Cramér-von Mises ($F_2$), and Kolmogorov-Smirnov ($F_3$) statistics, the last with its p-value ($F_3(p)$). The best model for fitting the COVID-19 real data sets is the one with the smallest values of these measures, with the exception of $F_3(p)$, for which the largest value indicates the best model.
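The four information criteria and the Kolmogorov-Smirnov statistic can be computed as in the sketch below, which uses their standard textbook definitions; the paper's own formulas are not shown in this section, so these forms are an assumption. The log-likelihood value in the example is hypothetical and is not a result from Tables 9 and 10.

```python
import math

def information_criteria(log_lik, k, n):
    """Standard criteria for a model with k parameters fitted to n observations."""
    aic = 2 * k - 2 * log_lik
    return {
        "AIC": aic,
        "CAIC": aic + 2 * k * (k + 1) / (n - k - 1),  # corrected AIC, needs n > k + 1
        "BIC": k * math.log(n) - 2 * log_lik,
        "HQIC": 2 * k * math.log(math.log(n)) - 2 * log_lik,
    }

def ks_statistic(cdf, data):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d_plus = max(i / n - cdf(x) for i, x in enumerate(xs, start=1))
    d_minus = max(cdf(x) - (i - 1) / n for i, x in enumerate(xs, start=1))
    return max(d_plus, d_minus)

# Hypothetical two-parameter fit to forty observations
criteria = information_criteria(log_lik=-50.0, k=2, n=40)
```

Smaller values of each criterion and of the Kolmogorov-Smirnov distance indicate a better fit, which matches the selection rule stated above.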
For the two COVID-19 real data sets considered, the analytical measures and the MLEs with their standard errors (SE) are provided in Tables 9 and 10, respectively. As a direct result, we conclude that the WBHD model performs far better than the competing models. Figs 5 and 6 show, for the two COVID-19 real data sets respectively, the WBHD fit using the P-P plot and the fitted PDF, CDF, and SF plots. Figs 7 and 8 show, respectively for the two COVID-19 real data sets, the behaviour of the log-likelihood function around the estimated parameters; it is unimodal in each of the estimated parameters.

Fig 4. Graphs of the VR and TVR by the values in Table 7.
https://doi.org/10.1371/journal.pone.0276688.g004

Concluding remarks on the application results
1. The WBHD is more flexible than its baseline model (BHD) for fitting the two COVID-19 real data sets.
2. The analytical measures and the MLEs with their standard errors (SE) for the two COVID-19 real data sets, provided in Tables 9 and 10, respectively, indicate that the proposed model is the best model for fitting the analyzed data sets.
3. From the analysis of the two COVID-19 real data sets, we conclude that the WBHD model performs far better than the competing models.

Conclusion
A new lifetime distribution, named the WBHD, was presented in this paper, and its statistical properties were derived. Eleven classical estimation approaches were considered for obtaining point estimates of the unknown WBHD parameters α and a. A simulation study was carried out using R to compare the effectiveness of the estimation approaches. Two COVID-19 data sets were used to illustrate the benefits of the proposed distribution. Compared with the other distributions under consideration, the WBHD provided the best fit to the data. In addition, Figs 7 and 8 show that the log-likelihood function attains a global maximum.

Future work
In future work, the proposed model may be used to model real data sets in a number of different areas, such as reliability engineering and survival analysis. Additionally, the WBHD may be extended to a bivariate WBHD, and this extension can then be used for modelling real data sets. It is possible to have a discussion about using Bayesian estimation to determine the